Constructing parse forests that include exactly the n-best PCFG trees

Authors

  • Pierre Boullier
  • Alexis Nasr
  • Benoît Sagot
Abstract

This paper describes and compares two algorithms that take as input a shared PCFG parse forest and produce shared forests that contain exactly the n most likely trees of the initial forest. Such forests are suitable for subsequent processing, such as (some types of) reranking or LFG f-structure computation, that can be performed on top of a shared forest but that may have a high (e.g., exponential) complexity w.r.t. the number of trees contained in the forest. We evaluate the performance of both algorithms on real-scale NLP forests generated with a PCFG extracted from the Penn Treebank.

Similar resources

Estimating Probabilities for an Indonesian Stochastic Parser using the Inside Outside Algorithm

This paper presents work in constructing a Probabilistic Context Free Grammar (PCFG) parser for Indonesian. Due to the unavailability of a large manually parsed corpus, we start from an existing symbolic parser to parse a relatively small collection of Indonesian sentences. A PCFG language model is extracted, ignoring explicit linguistic information encoded in feature structures, and is subsequ...

An Effective Framework for Chinese Syntactic Parsing

This paper presents an effective framework for Chinese syntactic parsing, which includes two parts. The first is a parsing framework based on an improved bottom-up chart parsing algorithm; it integrates the beam search strategy of the N-best algorithm and the heuristic function of the A* algorithm for pruning, and then obtains multiple parse trees. The second is a novel evaluation mode...

Generalized Queries on Probabilistic Context-Free Grammars

Probabilistic context-free grammars (PCFGs) provide a simple way to represent a particular class of distributions over sentences in a context-free language. Efficient parsing algorithms for answering particular queries about a PCFG (i.e., calculating the probability of a given sentence, or finding the most likely parse) have been applied to a variety of pattern-recognition problems. We extend t...

Parsing with PCFGs and Automatic F-Structure Annotation

The development of large-coverage, rich unification- (constraint-) based grammar resources is very time-consuming and expensive, and requires a great deal of linguistic expertise. In this paper we report initial results on a new methodology that attempts to partially automate the development of substantial parts of large-coverage, rich unification- (constraint-) based grammar resources. The method is based on ...

Generalized Queries on Probabilistic Context-Free Grammars

Probabilistic context-free grammars (PCFGs) provide a simple way to represent a particular class of distributions over sentences in a context-free language. Efficient parsing algorithms for answering particular queries about a PCFG (i.e., calculating the probability of a given sentence, or finding the most likely parse) have been developed, and applied to a variety of pattern-recognition problem...

Publication date: 2009